Analysis of x86 instruction set usage for DOS/Windows applications and its implication on superscalar design
Abstract
Understanding instruction set usage in typical DOS/Windows applications plays an important role in designing high-performance x86-compatible microprocessors. This paper presents the tools for such analysis, the analysis results, and their implications for the design of a superscalar processor, based on a RISC core, for efficient x86 instruction execution. The analysis tools include monitoring systems for both DOS and Windows 95 applications, either with or without source code. Many commercial software programs are analyzed, including MS Word, MS Excel, Netscape, Netterm, and DOS commands. The reported results include the execution frequencies of x86 instructions, the execution frequencies of the micro-operations that implement them, the average cycles per x86 instruction and per micro-operation, and the average number of micro-operations per x86 instruction. These results are used to determine many important design parameters of the superscalar processor, including the decoder combination and hardware optimizations for frequently executed instructions. The analysis guides the optimization of the register-to-register move, push, and pop operations, which yields a 17.4% reduction in micro-operation cycles and a 32% reduction in the load on the integer unit at very minor hardware cost. The reduced load on the integer unit may in turn reduce the number of integer units needed in the superscalar architecture.
This work is supported by NSC under contract numbers 85-2262-E-009-010R and 86-2262E-009-009.
Accepted to the International Conference on Computer Design, 1998.
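The metrics named in the abstract (instruction frequencies, micro-operations per x86 instruction, cycles per instruction and per micro-operation) can be illustrated with a minimal sketch. This is not the authors' tooling: the `UOP_TABLE` decode table, its micro-operation and cycle counts, and the sample trace are all hypothetical placeholders chosen only to show how such averages are derived from an instruction trace.

```python
from collections import Counter

# Hypothetical decode table: x86 mnemonic -> (micro-op count, total cycles).
# These values are illustrative assumptions, not figures from the paper.
UOP_TABLE = {
    "mov": (1, 1),
    "push": (2, 2),
    "pop": (2, 2),
    "add": (1, 1),
    "call": (3, 4),
}


def summarize(trace):
    """Compute frequency and average micro-op/cycle metrics for a trace."""
    if not trace:
        raise ValueError("trace must contain at least one instruction")
    counts = Counter(trace)
    total_insts = sum(counts.values())
    # Weight each mnemonic's micro-op and cycle cost by how often it executed.
    total_uops = sum(UOP_TABLE[m][0] * n for m, n in counts.items())
    total_cycles = sum(UOP_TABLE[m][1] * n for m, n in counts.items())
    return {
        "frequency": {m: n / total_insts for m, n in counts.items()},
        "uops_per_inst": total_uops / total_insts,
        "cycles_per_inst": total_cycles / total_insts,
        "cycles_per_uop": total_cycles / total_uops,
    }


# Made-up dynamic instruction trace (mnemonics only).
trace = ["mov", "push", "mov", "add", "pop", "call", "mov", "add"]
stats = summarize(trace)
```

With real data, the trace would come from a monitoring system and the table from the processor's actual x86-to-micro-operation mapping; instructions that dominate `frequency` are the natural candidates for hardware optimization.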
Similar resources
Application of Instruction Analysis/Synthesis Tools to x86’s Functional Unit Allocation
Designing a cost-effective superscalar architecture for x86-compatible microprocessors is a challenging task in terms of both technical difficulty and commercial value. One of the important design issues is the measurement of the distribution of functional unit usage and the micro operation level parallelism (MLP), which together determine the proper allocation of functional units in the super...
Application of instruction analysis/scheduling techniques to resource allocation of superscalar processors
This paper presents the development of instruction analysis/scheduling CAD techniques to measure the distribution of functional unit usage and the micro operation level parallelism (MLP), which together determine the proper functional unit allocation for superscalar microprocessors, such as the x86 microprocessors. The proposed techniques fit in the early design exploration phase in which the t...
In-depth analysis of x86 instruction set condition codes influence on superscalar execution
Instruction set design is a crucial aspect of computer architecture. The requirements it must fulfill have evolved over time. For superscalar processing the most important feature is to avoid code coupling caused by data dependencies. However, instruction sets may have particular characteristics that produce a negative impact on the amount of available parallelism for which it is important to ana...
Dynamically Matching ILP Characteristics Via a Heterogeneous Clustered Microarchitecture
Applications vary in the degree of instruction level parallelism (ILP) available to be exploited by a superscalar processor. The ILP can also vary significantly within an application. On one end of the microarchitecture space are monolithic superscalar designs that exploit parallelism within an application. At another end of the spectrum are clustered architectures having many simple cores that...
PLX: An Instruction Set Architecture and Testbed for Multimedia Information Processing
PLX is a concise instruction set architecture (ISA) that combines the most useful features from previous generations of multimedia instruction sets with newer ISA features for high-performance, low-cost multimedia information processing. Unlike previous multimedia instruction sets, PLX is not added onto a base processor ISA, but designed from the beginning as a standalone processor architecture...